Thursday, December 18, 2003
LOL, wise words from the Tim Bray essays....
If you’re a data-structures kind of person you can now have some fun thinking how you’d store the words from the word list (trickier than you think), and how you store the postings (they’re fixed in size but there are more of them), and how you point from the word list to the postings list and from the postings list to the document list.
If you’re not, trust me, this is the kind of thing data-structures people live for; give them the problem and they’ll go away happily and not bother you for a while. Don’t forget to tell them that they need to allow space for somewhere between twenty thousand and a few million unique words, and many billions of documents, each with potentially lots of postings, so really a very large number indeed of postings. And, it all has to be updateable.
And the cool thing is, when you tell them those big numbers, it will make them happier! Aren’t geeks wonderful?
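For the data-structures people in the audience, here is roughly what that word list / postings / document list arrangement could look like as a toy in Python. This is only a sketch under my own assumptions, not anything from the essays: the class name `TinyIndex`, the methods `add` and `search`, and the choice of a `(doc_id, position)` pair as the fixed-size posting are all made up for illustration, and an in-memory dict is nowhere near the scale the quote talks about.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical): word list -> postings -> document list."""

    def __init__(self):
        # Document list: a document's id is simply its position in this list.
        self.documents = []
        # Word list: each word points at its postings list.
        # A posting here is a fixed-size pair (doc_id, word_position).
        self.postings = defaultdict(list)

    def add(self, name, text):
        """Index a new document; this is the 'updateable' part."""
        doc_id = len(self.documents)
        self.documents.append(name)
        for position, word in enumerate(text.lower().split()):
            self.postings[word].append((doc_id, position))
        return doc_id

    def search(self, word):
        """Follow word -> postings -> documents and return matching names."""
        doc_ids = {doc_id for doc_id, _ in self.postings.get(word.lower(), [])}
        return sorted(self.documents[doc_id] for doc_id in doc_ids)


index = TinyIndex()
index.add("a.txt", "geeks love big numbers")
index.add("b.txt", "Big numbers make geeks happy")
matches = index.search("big")  # names of the documents containing "big"
```

The fun part the quote is hinting at is everything this toy skips: keeping a few million words and billions of postings on disk, and still being able to update them.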
Full TOC here.
And that's Table Of Contents, you n00b ;-)